AITopics | gaze estimation

gaze estimation

SLYKLatent: A Learning Framework for Gaze Estimation Using Deep Facial Feature Learning

Adebayo, Samuel, Dessing, Joost C., McLoone, Seán

arXiv.org Artificial Intelligence, Nov-6-2025

In this research, we present SLYKLatent, a novel approach for enhancing gaze estimation by addressing appearance instability challenges in datasets due to aleatoric uncertainties, covariate shifts, and test domain generalization. SLYKLatent uses self-supervised learning for initial training on facial expression datasets, followed by refinement with a patch-based tri-branch network and an inverse explained variance-weighted training loss function. Our evaluation on benchmark datasets achieves a 10.9% improvement on Gaze360, surpasses the top MPIIFaceGaze results by 3.8%, and leads on a subset of ETH-XGaze by 11.6%, outperforming existing methods by significant margins. Adaptability tests on RAF-DB and AffectNet show 86.4% and 60.9% accuracies, respectively. Ablation studies confirm the effectiveness of SLYKLatent's novel components.

artificial intelligence, estimation, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/THMS.2025.3553404

2402.01555

Country:

Europe > United Kingdom (0.46)
Africa > Nigeria (0.28)

Genre: Research Report > Promising Solution (0.48)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(2 more...)

Gaze Estimation for Human-Robot Interaction: Analysis Using the NICO Platform

Palider, Matej, Eldardeer, Omar, Kocur, Viktor

arXiv.org Artificial Intelligence, Oct-9-2025

This is mainly because of the importance of the gaze as a social non-verbal cue for interaction [3]. It drives many different social cognitive mechanisms (such as joint attention, intention prediction, and task coordination) and provides an explainable behaviour for others [4,5]. Affective states are also represented in the gaze behaviour [6]. The ability to perceive and understand social cues affects the effectiveness and efficiency of the whole interaction experience. Gaze understanding and following is one of the earliest behaviour mechanisms developed by infants to engage in different social communication scenarios [7]. Therefore, achieving high accuracy in gaze estimation is a key enabler of seamless human-robot interaction. Despite the significant progress of gaze estimation methodologies, these methods have not been fully evaluated in real human-robot interaction scenarios. In this paper we present an applied evaluation of the latest gaze estimation methods in a standard HRI scenario, specifically when the human and the robot are engaged in a shared task space (e.g., a table surface).

artificial intelligence, estimation, estimation method, (14 more...)

arXiv.org Artificial Intelligence

2509.24001

Country: Europe (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.92)

WEBEYETRACK: Scalable Eye-Tracking for the Browser via On-Device Few-Shot Personalization

Davalos, Eduardo, Zhang, Yike, Srivastava, Namrata, Thatigotla, Yashvitha, Salas, Jorge A., McFadden, Sara, Cho, Sun-Joo, Goodwin, Amanda, TS, Ashwin, Biswas, Gautam

arXiv.org Artificial Intelligence, Aug-28-2025

With advancements in AI, new gaze estimation methods are exceeding state-of-the-art (SOTA) benchmarks, but their real-world application reveals a gap with commercial eye-tracking solutions. Factors like model size, inference time, and privacy often go unaddressed. Meanwhile, webcam-based eye-tracking methods lack sufficient accuracy, in particular due to head movement. To tackle these issues, we introduce WebEyeTrack, a framework that integrates lightweight SOTA gaze estimation models directly in the browser. It incorporates model-based head pose estimation and on-device few-shot learning with as few as nine calibration samples (k < 9). WebEyeTrack adapts to new users, achieving SOTA performance with an error margin of 2.32 cm on GazeCapture and real-time inference speeds of 2.4 milliseconds on an iPhone 14. Our open-source code is available at https://github.com/RedForestAi/WebEyeTrack.

artificial intelligence, estimation, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2508.19544

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Industry:

Information Technology (0.93)
Government > Regional Government (0.47)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

DMAGaze: Gaze Estimation Based on Feature Disentanglement and Multi-Scale Attention

Chen, Haohan, Liu, Hongjia, Lan, Shiyong, Wang, Wenwu, Qiao, Yixin, Li, Yao, Deng, Guonan

arXiv.org Artificial Intelligence, May-27-2025

Gaze estimation, which predicts gaze direction, commonly faces the challenge of interference from complex gaze-irrelevant information in face images. In this work, we propose DMAGaze, a novel gaze estimation framework that exploits information from facial images in three aspects: gaze-relevant global features (disentangled from facial image), local eye features (extracted from cropped eye patch), and head pose estimation features, to improve overall performance. Furthermore, we introduce a new cascaded attention module named Multi-Scale Global Local Attention Module (MS-GLAM). Through a customized cascaded attention structure, it effectively focuses on global and local information at multiple scales, further enhancing the information from the Disentangler. Finally, the global gaze-relevant features disentangled by the upper face branch, combined with head pose and local eye features, are passed through the detection head for high-precision gaze estimation. Our proposed DMAGaze has been extensively validated on two mainstream public datasets, achieving state-of-the-art performance.

Keywords: gaze estimation, feature disentanglement, Gaussian similarity, multi-scale attention

1. Introduction

Gaze estimation, the task of predicting gaze direction, crucial for measuring human attention, is widely applied in areas like saliency detection [1, 2], virtual reality [3], driver distraction monitoring [4], human-computer interaction [5] and autism diagnosis [6]. Recently, gaze estimation has shifted from model-based methods to appearance-based methods.

artificial intelligence, estimation, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2504.1116

Country:

Europe > United Kingdom (0.28)
Europe > Finland (0.28)
Asia > China (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.54)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Gaze estimation learning architecture as support to affective, social and cognitive studies in natural human-robot interaction

Lombardi, Maria, Maiettini, Elisa, Wykowska, Agnieszka, Natale, Lorenzo

arXiv.org Artificial Intelligence, Oct-25-2024

Gaze is a crucial social cue in any interaction scenario and drives many mechanisms of social cognition (joint and shared attention, predicting human intention, coordination tasks). Gaze direction is an indication of social and emotional functions affecting the way emotions are perceived. Evidence shows that embodied humanoid robots endowed with social abilities can be seen as sophisticated stimuli to unravel many mechanisms of human social cognition while increasing engagement and ecological validity. In this context, building a robotic perception system that automatically estimates the human gaze relying only on the robot's sensors is still demanding. The main goal of the paper is to propose a learning robotic architecture that estimates the human gaze direction in table-top scenarios without any external hardware. Table-top tasks are widely used in experimental psychology studies because they are suitable for implementing numerous scenarios that allow agents to collaborate while maintaining a face-to-face interaction. Such an architecture can provide valuable support in studies where external hardware might represent an obstacle to spontaneous human behaviour, especially in environments less controlled than the laboratory (e.g., in clinical settings). A novel dataset was also collected with the humanoid robot iCub, including images annotated from 24 participants in different gaze conditions.

architecture, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.19374

Country:

Europe > Italy > Liguria > Genoa (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report > Experimental Study (0.67)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Cross-Dataset Gaze Estimation by Evidential Inter-intra Fusion

Wang, Shijing, Huang, Yaping, Xie, Jun, Tian, Yi, Chen, Feng, Wang, Zhepeng

arXiv.org Artificial Intelligence, Sep-7-2024

Achieving accurate and reliable gaze predictions in complex and diverse environments remains challenging. Fortunately, it is straightforward to access diverse gaze datasets in real-world applications. We discover that training on these datasets jointly can significantly improve the generalization of gaze estimation, which is overlooked in previous works. However, due to the inherent distribution shift across different datasets, simply mixing multiple datasets decreases the performance in the original domain despite gaining better generalization abilities. To address the problem of "cross-dataset gaze estimation", we propose a novel Evidential Inter-intra Fusion (EIF) framework for training a cross-dataset model that performs well across all source and unseen domains. Specifically, we build independent single-dataset branches for various datasets, where the data space is partitioned into overlapping subspaces within each dataset for local regression, and further create a cross-dataset branch to integrate the generalizable features from single-dataset branches. Furthermore, evidential regressors based on the Normal and Inverse-Gamma (NIG) distribution are designed to provide uncertainty estimation in addition to predicting gaze. Building upon this foundation, our proposed framework achieves both intra-evidential fusion among multiple local regressors within each dataset and inter-evidential fusion among multiple branches by a Mixture of Normal Inverse-Gamma (MoNIG) distribution. Experiments demonstrate that our method consistently achieves notable improvements in both source domains and unseen domains.
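To make the evidential regression idea in this abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the standard deep evidential regression parameterization, in which a regressor outputs four NIG parameters (gamma, nu, alpha, beta), and it assumes the NIG summation rule commonly given in the MoNIG literature for combining two such outputs. The class name `NIG` and the functions `aleatoric`, `epistemic`, and `fuse` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class NIG:
    """Normal-Inverse-Gamma parameters for one scalar gaze component.

    Assumed constraints: nu > 0, alpha > 1, beta > 0.
    """
    gamma: float  # predicted value (mean of the Normal)
    nu: float     # virtual observation count supporting the mean
    alpha: float  # shape of the Inverse-Gamma over the variance
    beta: float   # scale of the Inverse-Gamma over the variance


def aleatoric(p: NIG) -> float:
    """Expected data noise, E[sigma^2] = beta / (alpha - 1)."""
    return p.beta / (p.alpha - 1.0)


def epistemic(p: NIG) -> float:
    """Uncertainty of the mean, Var[mu] = beta / (nu * (alpha - 1))."""
    return p.beta / (p.nu * (p.alpha - 1.0))


def fuse(a: NIG, b: NIG) -> NIG:
    """Combine two NIG outputs with the NIG summation rule (assumed here).

    The fused mean is an evidence-weighted average, and disagreement
    between the two means increases the fused beta.
    """
    nu = a.nu + b.nu
    gamma = (a.nu * a.gamma + b.nu * b.gamma) / nu
    alpha = a.alpha + b.alpha + 0.5
    beta = (
        a.beta
        + b.beta
        + 0.5 * a.nu * (a.gamma - gamma) ** 2
        + 0.5 * b.nu * (b.gamma - gamma) ** 2
    )
    return NIG(gamma, nu, alpha, beta)


# Illustrative values only: two regressors predicting a yaw angle in degrees.
local_a = NIG(gamma=10.0, nu=1.0, alpha=2.0, beta=1.0)
local_b = NIG(gamma=20.0, nu=3.0, alpha=2.0, beta=1.0)
fused = fuse(local_a, local_b)
```

In this sketch the regressor with more evidence (larger nu) pulls the fused prediction toward its own estimate. The same operation could in principle be applied first among local regressors and then among branches, which is how the abstract describes intra- and inter-evidential fusion, though the paper's exact formulation may differ.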

dataset, estimation, source domain, (12 more...)

arXiv.org Artificial Intelligence

2409.04766

Country: Asia > China > Beijing > Beijing (0.05)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Gaze Estimation on Spresense

Ruegg, Thomas, Bonazzi, Pietro, Ronco, Andrea

arXiv.org Artificial Intelligence, Nov-20-2023

Gaze estimation is a valuable technology with numerous applications in fields such as human-computer interaction, virtual reality, and medicine. This report presents the implementation of a gaze estimation system using the Sony Spresense microcontroller board and explores its performance in latency, MAC/cycle, and power consumption. The report also provides insights into the system's architecture, including the gaze estimation model used. Additionally, a demonstration of the system is presented, showcasing its functionality and performance. Our lightweight model TinyTrackerS is a mere 169Kb in size, uses 85.8k parameters, and runs on the Spresense platform at 3 FPS.

estimation system, gaze estimation, spresense, (13 more...)

arXiv.org Artificial Intelligence

2308.12313

Country:

Europe > Switzerland > Zürich > Zürich (0.24)
North America > United States (0.05)

Genre: Research Report (0.40)

Technology:

Information Technology > Human Computer Interaction > Interfaces (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Semi-Synthetic Dataset Augmentation for Application-Specific Gaze Estimation

Leblond-Menard, Cedric, Picard-Krashevski, Gabriel, Achiche, Sofiane

arXiv.org Artificial Intelligence, Oct-27-2023

Although the number of gaze estimation datasets is growing, the application of appearance-based gaze estimation methods is mostly limited to estimating the point of gaze on a screen. This is in part because most datasets are generated in a similar fashion, where the gaze target is on a screen close to the camera's origin. In other applications such as assistive robotics or marketing research, the 3D point of gaze might not be close to the camera's origin, meaning models trained on current datasets do not generalize well to these tasks. We therefore suggest generating a textured three-dimensional mesh of the face and rendering the training images from a virtual camera at a specific position and orientation related to the application as a means of augmenting the existing datasets. In our tests, this led to an average 47% decrease in gaze estimation angular error.
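Angular error, the metric this abstract reports, is the angle between the predicted and ground-truth 3D gaze direction vectors. The sketch below shows one common way to compute it, plus the arithmetic behind a relative decrease. The function names and the numbers in the usage lines are illustrative assumptions, not values from the paper.

```python
import math


def angular_error_deg(pred, target):
    """Angle in degrees between two 3D gaze direction vectors."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in target))
    if norm_p == 0.0 or norm_t == 0.0:
        raise ValueError("gaze vectors must be non-zero")
    # Clamp so floating-point rounding cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cos_angle))


def relative_decrease(baseline_error, new_error):
    """Fractional reduction of new_error relative to baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Illustrative usage with made-up vectors and errors.
same = angular_error_deg((0.0, 0.0, -1.0), (0.0, 0.0, -1.0))
orthogonal = angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
drop = relative_decrease(10.0, 5.3)
```

Under these made-up numbers, a baseline error of 10.0 degrees falling to 5.3 degrees corresponds to a 47% relative decrease, which is the kind of figure the abstract quotes.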

application, dataset, gaze estimation, (15 more...)

arXiv.org Artificial Intelligence

2310.18469

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.88)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.34)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.31)

Contrastive Representation Learning for Gaze Estimation

Jindal, Swati, Manduchi, Roberto

arXiv.org Artificial Intelligence, Oct-24-2022

Self-supervised learning (SSL) has become prevalent for learning representations in computer vision. Notably, SSL exploits contrastive learning to encourage visual representations to be invariant under various image transformations. The task of gaze estimation, on the other hand, demands not just invariance to various appearances but also equivariance to the geometric transformations. In this work, we propose a simple contrastive representation learning framework for gaze estimation, named Gaze Contrastive Learning (GazeCLR). GazeCLR exploits multi-view data to promote equivariance and relies on selected data augmentation techniques that do not alter gaze directions for invariance learning. Our experiments demonstrate the effectiveness of GazeCLR for several settings of the gaze estimation task. Particularly, our results show that GazeCLR improves the performance of cross-domain gaze estimation and yields as high as 17.2% relative improvement. Moreover, the GazeCLR framework is competitive with state-of-the-art representation learning methods for few-shot evaluation. The code and pre-trained models are available at https://github.com/jswati31/gazeclr.
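As a rough illustration of the contrastive objective this abstract refers to, the sketch below implements a generic InfoNCE-style loss in plain Python. The abstract does not specify GazeCLR's exact loss, so this is an assumption: the function names, the temperature value, and the toy embeddings are all hypothetical. The idea is that an anchor embedding should be more similar to its positive (for example, an augmented view that keeps the gaze direction unchanged) than to any negative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor.

    Computes -log( exp(s_pos / t) / sum_k exp(s_k / t) ), where the sum
    runs over the positive and all negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Toy 2D embeddings, for illustration only.
aligned = info_nce((1.0, 0.0), (1.0, 0.0), [(0.0, 1.0)])
misaligned = info_nce((1.0, 0.0), (0.0, 1.0), [(1.0, 0.0)])
```

The loss is small when the positive is the closest embedding to the anchor and large when a negative is closer, which is what pushes the learned representation toward invariance under the chosen augmentations.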

artificial intelligence, contrastive representation learning, machine learning, (1 more...)

arXiv.org Artificial Intelligence

2210.13404

Genre: Research Report > New Finding (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Eye Gaze Estimation Model Analysis

Kottwani, Aveena, Kumar, Ayush

arXiv.org Artificial Intelligence, Jul-28-2022

We explore techniques for eye gaze estimation using machine learning. Eye gaze estimation is a common problem in behavior analysis and human-computer interfaces. The purpose of this work is to discuss various model types for eye gaze estimation and present the results from predicting gaze direction using eye landmarks in unconstrained settings. In unconstrained real-world settings, feature-based and model-based methods are outperformed by recent appearance-based methods due to factors like illumination changes and other visual artifacts. We discuss a learning-based method for eye region landmark localization trained exclusively on synthetic data. We discuss how to use detected landmarks as input to iterative model-fitting and lightweight learning-based gaze estimation methods, and how to use the model for person-independent and personalized gaze estimation.

artificial intelligence, estimation, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.13140/RG.2.2.22546.99522

2207.14373

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
